Iterative reconstruction of speech from short-time Fourier transform phase and magnitude spectra
Authors
Abstract
In this paper, we consider the iterative reconstruction of one-dimensional signals (specifically speech signals) from the magnitude spectrum and the phase spectrum. While this topic has been extensively researched and documented, we wish to recast some well-established results for the benefit of new researchers and those who desire a short, yet comprehensive, review of the subject. The three main points of the review are: (i) a signal can be reconstructed to within a scale factor from its phase spectrum, (ii) a signal cannot be reconstructed to within a scale factor from its magnitude spectrum, and (iii) a signal can be reconstructed to within a scale factor from its magnitude spectrum when the phase sign (i.e., one bit of phase spectrum information) is known. Through a number of illustrative examples, we first demonstrate how the algorithms work when the spectral information is determined over the entire duration of the signal. We then demonstrate that the algorithms are equally valid for reconstructing a signal from the spectra obtained from short-time segments. In addition, we present the results of further experiments in which we attempted to reconstruct a speech signal from only partial phase spectrum information (in the absence of all magnitude spectrum information). We make the following observations: (i) intelligible (albeit noisy) signal reconstruction is possible from knowledge of only the phase spectrum sign information, (ii) an intelligible signal cannot be reconstructed from knowledge of only the phase spectrum frequency-derivative or only the phase spectrum time-derivative, and (iii) an intelligible signal can be reconstructed from the combined knowledge of both the phase spectrum frequency-derivative and time-derivative. © 2006 Elsevier Ltd. All rights reserved.
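To make point (i) concrete, the following is a minimal sketch of a generic alternating-constraints iteration for phase-only reconstruction; it is not necessarily the exact algorithm used in the paper. It repeatedly imposes the known phase in the frequency domain and the known finite duration in the time domain. The names `dft`, `reconstruct_from_phase`, `n_iter` and `init_mag` are illustrative, and the choice of a DFT at least twice as long as the signal is an assumption made here, not something stated in the abstract.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(M^2) DFT (or inverse DFT); adequate for a toy example."""
    m = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(m):
        acc = 0j
        for n in range(m):
            acc += x[n] * cmath.exp(sign * 2j * math.pi * k * n / m)
        out.append(acc / m if inverse else acc)
    return out


def reconstruct_from_phase(phase, n_samples, n_iter=200, init_mag=None):
    """Estimate a real signal of length n_samples from its DFT phase only.

    phase    : phase of the M-point DFT of the zero-padded signal
               (M >= 2 * n_samples is assumed in this sketch)
    init_mag : optional starting magnitude; unit magnitude if omitted
    """
    m = len(phase)
    mag = list(init_mag) if init_mag is not None else [1.0] * m
    x = [0.0] * n_samples
    for _ in range(n_iter):
        # Frequency-domain constraint: current magnitude, known phase.
        spectrum = [a * cmath.exp(1j * p) for a, p in zip(mag, phase)]
        y = dft(spectrum, inverse=True)
        # Time-domain constraint: real-valued, zero outside the first n_samples.
        x = [y[n].real for n in range(n_samples)]
        padded = x + [0.0] * (m - n_samples)
        # Keep only the magnitude of the new spectrum for the next pass.
        mag = [abs(v) for v in dft(padded)]
    return x


if __name__ == "__main__":
    original = [1.0, -0.5, 0.25, 0.75]          # toy signal, not speech
    spectrum = dft(original + [0.0] * 4)        # 8-point DFT of padded signal
    known_phase = [cmath.phase(v) for v in spectrum]
    estimate = reconstruct_from_phase(known_phase, len(original))
    # The estimate can match the original only up to a positive scale
    # factor, so compare the two after normalising their energies.
    print(original)
    print(estimate)
```

Because the magnitude is never supplied, any agreement with the original is up to a scale factor, which is why the comparison should be made after normalisation. For short-time processing, the same loop would be applied per analysis frame before overlap-add, but that extension is not shown here.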
Related articles
Some experiments on iterative reconstruction of speech from STFT phase and magnitude spectra
In our earlier work, we have measured human intelligibility of stimuli reconstructed either from the short-time magnitude spectra or short-time phase spectra of a speech signal. We demonstrated that, even for small analysis window durations of 20-40 ms (of relevance to automatic speech recognition), the short-time phase spectrum can contribute to speech intelligibility as much as the short-time...
Usefulness of Phase Spectrum in H
Short-time Fourier transform of speech signal has two components: magnitude spectrum and phase spectrum. In this paper, relative importance of short-time magnitude and phase spectra on speech perception is investigated. Human perception experiments are conducted to measure intelligibility of speech tokens synthesized either from magnitude spectrum or phase spectrum. It is traditionally believed...
Usefulness of phase in human speech perception
Short-time Fourier transform of speech signal has two components: magnitude spectrum and phase spectrum. In this paper, relative importance of short-time magnitude and phase spectra on speech perception is investigated. Human perception experiments are conducted to measure intelligibility of speech tokens synthesized either from magnitude spectrum or phase spectrum. It is traditionally believed...
On the usefulness of STFT phase spectrum in human listening tests
The short-time Fourier transform (STFT) of a speech signal has two components: the magnitude spectrum and the phase spectrum. In this paper, the relative importance of short-time magnitude and phase spectra for speech perception is investigated. Human perception experiments are conducted to measure intelligibility of speech stimuli synthesized either from magnitude spectra or phase spectra. It ...
Analysis of signal reconstruction after modulation filtering
When the short-time Fourier transform (STFT) of an audio signal is arbitrarily modified, it no longer truly represents a time-domain signal. Classically, the accepted solution to obtain a time-domain signal from a modified STFT (MSTFT) is to invert the MSTFT to a time-domain signal that has an STFT that is closest to the MSTFT in a least squares sense. This is also the approach currently taken ...
Journal title:
- Computer Speech & Language
Volume 21, Issue
Pages -
Publication date: 2007